1. INTRODUCTION


Extensive morphological and biochemical changes occur in coronavirus (CoV)-infected
cells. Nevertheless, knowledge of the biochemical events occurring in host cells and of
the biochemistry of the infection remains limited. Infections by CoVs alter host cell
transcription and translation patterns, the cell cycle, the cytoskeleton, and apoptosis
pathways. In addition, in the host, CoV infection may cause inflammation; alterations of
the immune response, of cytokine and chemokine levels, of interferon (IFN)-induced gene
expression, and of stress responses; and modification of coagulation pathways. This
chapter focuses on selected biochemical aspects of CoV replication and transcription,
with special attention to the interaction between cell and viral factors.


2. INFLUENCE OF VIRAL AND CELLULAR PROTEINS IN CoV 
REPLICATION 


2.1. Nuclear Localization of CoV Proteins 


There are at least three CoV proteins that have been localized within the nucleus of
infected cells: nucleoprotein (N), 3b, and nsp1 (Table 1). The nucleolus has been
implicated in many aspects of cell biology, including ribosomal RNA (rRNA) synthesis and
ribosome biogenesis, gene silencing, senescence, and cell cycle regulation. The nucleolus
contains different factors including nucleolin, fibrillarin, spectrin, B23, rRNA, and
ribosomal proteins S5 and L9. Viruses interact with the nucleolus and its antigens; viral
proteins co-localize with factors such as nucleolin, B23, and fibrillarin and cause their
redistribution during infection. N proteins from CoV genus α (transmissible
gastroenteritis virus, TGEV), genus β (mouse hepatitis virus, MHV, and severe acute
respiratory syndrome coronavirus, SARS-CoV), and genus γ (infectious bronchitis virus,
IBV), and also from two arteriviruses [porcine reproductive and respiratory syndrome
virus (PRRSV) and equine arteritis virus (EAV)], localize in the nucleolus, and this may
be a common feature among nidovirus N proteins that influences host cell proliferation.
The association of N protein with the nucleus may be cell dependent. In fact, until now,
N protein has been located in the nucleus of LLC-PK1 and Vero cells transfected with
plasmids expressing N protein, but not in ST cells transfected with plasmids expressing
N protein or infected with TGEV. Overall, these data indicate that the presence of N
protein in the nucleus of infected cells might be of functional significance.

Table 1. Nidovirus proteins in the nucleus of infected cells.

PROTEIN   VIRUS      REFERENCE

N         IBV        Chen et al., 2003
          MHV        Wurm et al., 2001
          SARS-CoV   Timani et al., 2004
          TGEV       Wurm et al., 2001
          PRRSV      Rowland et al., 1999
          EAV        Tijms et al., 2002

3b        SARS-CoV   Yuan et al., 2005

nsp1      EAV        van der Meer et al., 1999

SARS-CoV ORF 3b encodes a protein of 154 amino acids that lacks similarity to any
known protein. Protein 3b is predominantly localized in the nucleolus. A functional
nuclear localization signal is located in amino acids 134 to 154. Ectopic overexpression
of protein 3b in Vero, 293, and COS-7 cells induced cell cycle arrest at the G0/G1 phase.
EAV nsp1 has also been localized in the nucleus. Therefore, in total, at least three
Nidovirales proteins (N, 3b, and nsp1) have been detected in the nucleus of infected
cells, suggesting that members of the Nidovirales may modify cell behavior through the
nucleus.


2.2. CoV Genome Replication 


In CoV replication, recognition of the 5’ and 3’ ends of the RNA genome by viral and
cellular proteins is most likely essential. Furthermore, the interaction of these ends is
probably a requirement for replication and transcription, as these processes must be
initiated at the 3’ end of the genome, and it has been shown that they are influenced by
sequences mapping at the 5’ end of the genome. There may be a direct interaction between
the CoV 5’ and 3’ ends, as predicted for MHV and TGEV RNA genomes in the absence of
protein using computer programs. Nevertheless, this direct interaction seems unlikely
inside cells in the presence of cell and viral proteins. In fact, during genome
synthesis, the first end synthesized (3’) will most likely be immediately folded and
nonspecifically covered by proteins, such as the N protein or nsp9, or by proteins
binding specific RNA motifs with characteristic secondary structures. The postulated
cross-talk between the 5’ and 3’ ends in CoV replication has been shown in our
laboratory. Precipitation of digoxigeninated 3’ ends by biotinylated 5’ ends using
streptavidin sepharose beads and the reverse (precipitation of digoxigeninated 5’ ends
with biotinylated 3’ ends) have been shown. This cross-talk between the 5’ and 3’ ends
of the TGEV genome requires only the presence of cell proteins.

The N protein probably has a prominent role in CoV replication, as it influences
many viral and cellular processes. The role of the N protein is most likely constrained
by its propensity to self-assemble to form the capsid and also by its phosphorylation
state. N protein activity has to be a consequence of its interaction with other viral and
cellular proteins and with virus and host cell nucleic acids. CoV N protein is associated
with the replicase complex in double-membrane structures derived from the endoplasmic
reticulum and also binds to genome RNA, forming the nucleocapsid. The nucleocapsid binds
to the M protein carboxy-terminus in the membranes of the endoplasmic reticulum
(ER)-Golgi intermediate compartment (ERGIC). Particles bud as immature virions with large
annular nucleocapsids. Immature virions are transported through the Golgi compartment,
where a major rearrangement of the nucleocapsid takes place, giving rise to secretory
vesicles containing mature virions with electron-dense cores.

The N protein has a variable size in different CoVs (Fig. 1). Self-interaction of N
proteins was first described in MHV. The N protein has conserved secondary structures,
including highly conserved α helices, and a highly conserved serine-rich domain
including the repetitive sequence SSDNSRSRSQSRSRSR (Fig. 1). Within this area,
several active N protein domains have been mapped, such as the RNA binding domain of
the IBV genome, the oligomerization domain (amino acids 184-196), and the M protein
binding domain (amino acids 168-208), which is also part of the N protein
oligomerization domain. These protein sequences may be crucial in maintaining the N
protein in a correct conformation. In fact, deletion of the 168-208 amino acid region
results in the complete loss of N protein dimerization.

Phosphorylation has been shown to cause conformational changes in MHV N protein
structure. TGEV and IBV phosphoserine residues have been mapped within the CoV N
protein primary and secondary structures. TGEV N protein serines 9, 156, 254, and 256
are phosphorylated in infected cells, while in IBV, N protein phosphorylation sites
have been localized to serines 190, 192, and 379 and to threonine 378. CoV N proteins
present a conserved pattern of secondary structural elements, and a strong correlation
has been observed between the MHV N protein three-domain organization and the predicted
structure. N protein domains I and III are the most unstructured and divergent among
CoVs, while domain II is more conserved. Interestingly, TGEV N serines 156, 254, and
256 were localized to domain II, adjacent to conserved secondary elements β3, β6, and
α7, respectively. Therefore, phosphorylation of these serine residues could affect the
structure of these secondary elements by introducing negative charges in a basic
environment, and thereby affect N protein RNA binding activity.

IBV N protein phosphorylation has been localized in sites distinct from those
identified in TGEV, based on sequence comparisons. This apparent discrepancy could be
explained by intrinsic differences between CoV species. The relevance of the identified
TGEV N protein phosphoserines has been analyzed by site-directed mutagenesis using
the TGEV infectious cDNA clone. Mutagenesis of all four TGEV phosphorylated serines to
alanine did not prevent TGEV rescue from infectious cDNA nor lead to a significant
reduction in TGEV titer. These mutations may affect the binding of CoV N protein to RNA
mediated by the amino terminus of this protein. Additional work is in progress to study
the role of TGEV N protein phosphorylation.

Figure 1. Scheme of N protein from different coronaviruses. The organization of N protein from four
representative CoVs of genus α (TGEV), genus β (MHV and SARS-CoV), and genus γ (IBV) is indicated (not
to scale). Conserved predicted structural elements are joined by gray shadowing zones. The three-domain
organization proposed for MHV N protein by P. Masters’ group is indicated as open boxes over the MHV N
protein (I, II, and III). P, phosphorylation sites. αc, protein domains with highly conserved alpha structure. AA,
amino acid. NLS, nuclear localization signal. RBD, RNA binding domain. OMD, oligomerization domain.
MPBD, M protein binding domain. S-S, disulfide bridge.

The requirement of N protein for virus replication and transcription is debated.
Certain observations suggest that N protein plays a role in replication, while others,
using either CoV or arterivirus systems, claim that N protein is not essential. Using
three TGEV-derived replicons, two containing the N gene in cis and another one lacking
this gene (Fig. 2), it has been clearly shown that a TGEV replicon, in the absence of N
protein provided either in cis or in trans, produced levels of a reporter subgenomic RNA
(gene 7) 50-fold greater than background levels (Fig. 2). Interestingly, when N protein
is provided in cis, replication-transcription increases 100-fold over the levels in the
absence of N protein. If N protein is exclusively provided in trans,
replication-transcription levels increase 10-fold more (i.e., 1000-fold over levels in
the absence of N protein). If N protein is in addition provided in cis, amplification
levels do not increase over those reached when the N protein is only provided in trans.
Two groups have observed background levels of CoV transcription in the absence of N
protein. Nevertheless, a substantial increase in CoV transcription is observed by
providing N protein either in cis or in trans. The increase in reporter gene expression
could be due to an increase in replication, in transcription levels, or in both. There
is general agreement that the presence of N protein enhances the rescue of infectious
virus from cDNA clones generated from different CoVs, such as IBV, human coronavirus
(HCoV)-229E, and TGEV, using in vitro RNA transcripts or replicons.
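
The fold values quoted above for the TGEV replicons combine multiplicatively, which is easy to lose track of in prose. The following minimal sketch works through the arithmetic. Setting the background level to 1 arbitrary unit and the function and label names are assumptions made for illustration; only the 50-, 100-, and 10-fold factors come from the text.

```python
# Relative reporter sgmRNA (gene 7) levels implied by the fold values in the text.
# Assumption for illustration: background (nonreplicative clone) = 1 arbitrary unit.
FOLD_NO_N_OVER_BACKGROUND = 50   # replicon without N protein vs. background
FOLD_CIS_OVER_NO_N = 100         # N protein in cis vs. replicon without N protein
FOLD_TRANS_OVER_CIS = 10         # N protein in trans vs. N protein in cis


def relative_levels(background=1.0):
    """Return reporter levels (arbitrary units) for each condition."""
    no_n = background * FOLD_NO_N_OVER_BACKGROUND
    n_cis = no_n * FOLD_CIS_OVER_NO_N
    n_trans = n_cis * FOLD_TRANS_OVER_CIS
    return {
        "background": background,
        "replicon, no N": no_n,
        "N in cis": n_cis,
        "N in trans": n_trans,
    }


levels = relative_levels()
reference = levels["replicon, no N"]
for condition, level in levels.items():
    # Report each level and its ratio to the replicon lacking N protein.
    print(f"{condition:>15}: {level:>8.0f} units ({level / reference:g}-fold vs. no N)")
```

Under these assumptions, N protein provided in trans gives a level 1000-fold over the replicon lacking N protein, in agreement with the figure stated in the text.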


3. CoV TRANSCRIPTION 


CoV transcription, and in general transcription in the Nidovirales order, is an
RNA-dependent RNA synthesis process that includes a discontinuous step during the
production of subgenomic mRNAs. This transcription process ultimately generates a nested
set of subgenomic mRNAs that are 5’- and 3’-coterminal with the virus genome. The common
5’-terminal leader sequence, 93 nucleotides (nt) in TGEV, is fused to the 5’ end of the
mRNA coding sequence (body) by a discontinuous transcription mechanism. Sequences
preceding each gene represent signals for synthesis of subgenomic mRNAs (sgmRNAs).

Figure 2. Expression of gene 7 in the presence of N protein. To study the effect of N protein in TGEV-derived
replicon activity, the amount of mRNA7, expressed as relative units, was determined by real-time RT-PCR with
specific oligonucleotides in RNA samples isolated from standard BHK-pAPN (BHK) or BHK-pAPN
expressing N protein (BHK + N). Cells were transfected with either a nonreplicative cDNA clone (NO REP),
two replicons that express the N protein (REP 1 and REP 2), or a replicon that does not encode protein N (REP
3). N protein was provided in cis by the replicons, or in trans using a Sindbis virus replicon when indicated (+). -,
absence of N protein. The results indicate the mean values from three experiments, with standard deviations
shown as error bars.

These are the transcription-regulating sequences (TRSs), which include a conserved core
sequence (CS; 5’-CUAAAC-3’), identical in all TGEV genes (CS-B), and the 5’ and 3’
flanking sequences (5’ TRS and 3’ TRS, respectively) that regulate transcription (Fig.
3). As this CS sequence is also found at the 3’ end of the leader sequence (CS-L), it
could base pair with the nascent minus strand complementary to each CS-B (cCS-B). In
fact, the requirement for base pairing during transcription has been formally
demonstrated in arteriviruses and CoVs by experiments in which base pairing between CS-L
and the complement of CS-B was engineered in infectious genomic cDNAs. The canonical CS
was nonessential for the generation of sgmRNAs, but its presence led to transcription
levels at least 10³-fold higher than in its absence. The data obtained are compatible
with a model of transcription that includes three steps (Fig. 3): (i) formation of 5’-3’
complexes in the genomic RNA, (ii) base-pairing scanning of the nascent (-) RNA strand
by the TRS-L, and (iii) template switch during synthesis of the negative strand to
complete the (-) sgRNA. This template switch takes place after copying the CS sequence
and was predicted in silico based on a high base-pairing score between the nascent (-)
RNA strand and the TRS-L.
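
The base pairing between CS-L and cCS-B can be made concrete with a small sketch. It derives the minus-strand complement of the core sequence 5’-CUAAAC-3’ and locates CS copies in a plus-strand sequence. The function names and the short toy sequence are hypothetical and chosen only for illustration; the toy sequence is not taken from any CoV genome.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

CS = "CUAAAC"  # conserved core sequence, written 5'->3' (from the text)


def reverse_complement(rna):
    """Return the complementary strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(rna))


def find_cs_sites(plus_strand, cs=CS):
    """Return 0-based start positions of every CS copy in a plus-strand RNA."""
    positions = []
    start = plus_strand.find(cs)
    while start != -1:
        positions.append(start)
        start = plus_strand.find(cs, start + 1)
    return positions


# cCS-B: the stretch of nascent minus strand copied from a body CS.
c_cs_b = reverse_complement(CS)
print("cCS-B (5'->3'):", c_cs_b)

# CS-L has the same sequence as CS-B, so it is fully complementary to cCS-B.
print("CS-L pairs with cCS-B:", reverse_complement(c_cs_b) == CS)

# Hypothetical toy plus strand with two CS copies (illustration only).
toy_plus_strand = "AACUAAACGGAUGCUAAACUU"
print("CS positions in toy sequence:", find_cs_sites(toy_plus_strand))
```

Because every body CS is identical to the leader CS, the leader can in principle pair with the nascent minus strand at each of these sites, which is the situation the scanning step of the model addresses.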

The role in transcription of the four nucleotides immediately flanking the CS at both
the 5’ and 3’ ends has been studied using a transcriptionally inactive canonical CS
(CS-S2) internal to the S gene. The rationale for selecting 5’ and 3’ TRS flanking
sequences consisting of four nucleotides comes from the results of an in silico analysis
showing that, to predict both viral mRNAs and alternative mRNAs at noncanonical junction
sites, an optimal TRS-L should include the CS plus four nucleotides flanking the CS at
both ends. These predictions have been supported by experimental data obtained by
reverse genetic analysis of the sequences immediately flanking CS-S2. A good correlation
was observed between the free energy of TRS-L and cTRS-B duplex formation and the levels
of subgenomic mRNA-S2, demonstrating that base pairing between leader and body beyond
the CS was a determinant in the regulation of CoV transcription. In TRS mutants with
increasing complementarity between TRS-L and cTRS-B, a tendency to reach a plateau in ΔG
values was observed, suggesting that a more precise definition of the TRS limits might
be proposed, consisting of the central CS and approximately four nucleotides flanking
the CS at the 5’ and 3’ ends. Sequences downstream of the CS exert a stronger influence
on the template-switching decision, in accordance with a model of polymerase strand
transfer and template switching during minus-strand synthesis.
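
As a rough illustration of how complementarity between TRS-L and cTRS-B grows when the flanking nucleotides match, the sketch below counts base pairs over a TRS defined as the CS plus four nucleotides on each side. This is only a crude proxy: it is not a ΔG calculation, which would require nearest-neighbor thermodynamic parameters, and the flanking sequences and function names are hypothetical, not TGEV sequences.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}
WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
WOBBLE = {("G", "U"), ("U", "G")}

CS = "CUAAAC"  # conserved core sequence (from the text)


def count_pairs(trs_l, trs_b):
    """Count (Watson-Crick, G-U wobble) pairs between TRS-L and cTRS-B.

    Both arguments are plus-strand sequences written 5'->3'. cTRS-B is the
    minus-strand copy of TRS-B, so position i of TRS-L faces the complement
    of position i of TRS-B.
    """
    if len(trs_l) != len(trs_b):
        raise ValueError("TRS-L and TRS-B must have the same length")
    c_trs_b = [COMPLEMENT[base] for base in trs_b]
    watson_crick = sum((l, c) in WATSON_CRICK for l, c in zip(trs_l, c_trs_b))
    wobble = sum((l, c) in WOBBLE for l, c in zip(trs_l, c_trs_b))
    return watson_crick, wobble


# Hypothetical 4-nt flanks around the CS (illustration only).
trs_l = "GAAC" + CS + "GAAA"
body_variants = {
    "CS only": "UUGG" + CS + "UCCU",
    "CS + 3' flank": "UUGG" + CS + "GAAA",
    "CS + both flanks": "GAAC" + CS + "GAAA",
}

for name, trs_b in body_variants.items():
    wc, gu = count_pairs(trs_l, trs_b)
    print(f"{name:>17}: {wc} Watson-Crick pairs, {gu} G-U pairs")
```

With these toy sequences the pair count rises from 6 (CS only) to 14 (CS plus both flanks), after which no further gain is possible within the 14-nt window. This illustrates, in a simplified way, why a plateau is expected once the flanks are fully complementary.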

According to the working model of transcription proposed by our laboratory (Fig. 3),
the first step is the interaction of the leader TRS with a complex presumably formed by
the replicase, the helicase, the nascent RNA of negative polarity, and other viral and
cellular proteins involved in transcription. Candidate proteins have been reported by
several laboratories. On the viral side, essential proteins in transcription are the
RNA-dependent RNA polymerase (RdRp) and the helicase (Hel). In addition, N protein
probably increases basal transcription (see above), and nsp1 has clearly been implicated
in arterivirus transcription. It has also been suggested that NendoU may play a role by
specifically cutting the double-stranded RNA (transcriptive intermediates) generated
during the synthesis of the nascent RNA of negative polarity. NendoU nuclease has a
strong preference for cleavage at GU(U) sequences in double-stranded RNA substrates. It
has been suggested that the GU(U) sequence at the 3’ terminus of nascent minus-strand
RNAs, which corresponds to the conserved AAC nucleotides in the core of the CoV gene TRS
elements, might be a substrate of this activity; therefore, NendoU activity might be
involved in the transcription of subgenomic mRNAs. Data from our laboratory, in which we
analyzed around 90 different sgRNAs generated during the mutagenesis of a TGEV CS, and
also from other laboratories, support the functional relevance of the AAC sequence in
transcription, but further studies are required to provide a direct link to the
activities of enzymes such as the uridylate-specific endoribonuclease.
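
The correspondence between the AAC nucleotides of the CS and a GU(U) sequence at the 3’ terminus of the nascent minus strand follows from the direction of RNA synthesis. The sketch below copies the CS one nucleotide at a time from its 3’ end. For simplicity it treats the CS as if it were the very 3’ end of the template and ignores all downstream sequence, which is an assumption made only for illustration; the function name is hypothetical.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

CS = "CUAAAC"  # plus-strand core sequence, 5'->3' (from the text)


def nascent_minus_strand(template_plus, copied_nt):
    """Return the minus-strand product (5'->3') after copying `copied_nt`
    nucleotides, starting from the 3' end of the plus-strand template."""
    copied_region = template_plus[len(template_plus) - copied_nt:]
    # The polymerase reads the template 3'->5', so the product is the
    # complement of the copied region read in reverse.
    return "".join(COMPLEMENT[base] for base in reversed(copied_region))


for n in range(1, len(CS) + 1):
    product = nascent_minus_strand(CS, n)
    print(f"copied {n} nt: 5'-{product}-3'  3' terminus is GUU: {product.endswith('GUU')}")
```

After the three 3’-terminal nucleotides of the CS (AAC) have been copied, the nascent strand ends in GUU, and copying the whole CS yields 5’-GUUUAG-3’.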

In addition, cell proteins most likely play a role in the regulation of CoV
transcription. In fact, there are data directly involving heterogeneous nuclear
ribonucleoprotein (hnRNP) A1, hnRNP I (PTB), and the elongation factor eEF-1 in CoV
transcription. Furthermore, proteins such as the p100 kDa coactivator and annexin A2 may
be involved in CoV transcription, as we have shown that these proteins bind to TRS
sequences. In the arteriviruses, the p100 kDa coactivator interacts with nsp1, which is
involved in transcription, and may regulate this activity.

Figure 3. Three-step working model of coronavirus transcription. (I) 5’-3’ complex formation step. Proteins
binding the 5’ and 3’ end TGEV sequences are represented by ellipsoids. Leader sequence is indicated with
dark gray bars, CS sequences are indicated with a clear bar. An, poly A tail. (II) Base pairing scanning step.
Minus strand RNA is in a lighter color compared with positive-strand RNA. The transcription complex is
represented by the hexagon. Vertical dotted bars represent scanning of base pairing by the TRS-L sequence in
the transcription process. Vertical solid bars indicate complementarity between the genomic (gRNA) and the
nascent minus strand. Un, poly U tail. (III) The synthesis of the negative strand can continue to make a longer
sgRNA (III left), or a template switch step can take place (III right) as indicated in the text. The thick arrow
indicates the switch in the template made by the transcription complex to complete the synthesis of (-) sgRNA.

CoV genomic RNA interacts with at least three proteins, hnRNP A1, PTB, and N, which may
mediate the formation of complexes between the leader TRS and the transcription complex
at the body TRS. Formation of cyclic complexes could in principle be mediated by the
interaction among these proteins. In fact, binding between hnRNP A1 and PTB, hnRNP A1
and N protein, and PTB and N protein has been documented. Many of these biochemical
interactions have been reported in the past eight years, and their role can now be
reinterpreted in the context of the transcription model implying discontinuous RNA
synthesis during the production of the negative strand. A role in transcription has been
assigned to different proteins:

(i) hnRNP A1 protein binds to the complementary (negative-polarity) strand of the
MHV leader (cL) and to TRS sequences, particularly the consensus (3’-AGAUUUG-5’)
sequence of MHV RNA located at the 3’ end of the genome. Site-specific mutations of
TRSs inhibited mRNA transcription from MHV defective-interfering (DI) RNA in direct
proportion to the extent of reduction of hnRNP A1 binding to cL. The effect of hnRNP A1
on MHV RNA transcription was further confirmed in cell lines. Direct evidence for a
functional role for hnRNP A1 in MHV synthesis has been obtained in MHV-infected cells.
Binding of hnRNP A1 to a TRS also correlates with the efficiency of transcription from
that TRS. In addition, a C-terminus-truncated hnRNP A1 mutant exhibited dominant-negative
effects on viral genomic RNA replication and subgenomic transcription. Therefore,
hnRNP A1 may regulate CoV RNA-dependent transcription.

(ii) The N protein of MHV binds the UCUAAAC sequence of the leader RNA, and it
has been suggested that N protein is involved in MHV RNA transcription. The role of
N protein in MHV RNA replication has also been shown in an in vivo replication
system. These findings suggest that both cellular hnRNP A1 and viral N protein are
components of the MHV replication and transcription complex. As hnRNP A1 interacts
with some serine-arginine (SR)-rich proteins, and because N protein also contains an
SR motif, it has been proposed that hnRNP A1 interacts directly with N protein to
bring the leader RNA to the CS sequence of the template RNA for initiation of
subgenomic mRNA transcription. In fact, it has been shown that N protein interacts
with hnRNP A1 both in vivo and in vitro. These data were confirmed using the two-hybrid
system. In agreement with these results, we have shown by Far-Western blotting that
TGEV N protein binds PTB.

Template switch during transcription may be aided by the chaperone activity of CoV
N protein. During negative-strand RNA synthesis, a template switch is required to add a
negative copy of the leader to the negative strand. This process represents a
displacement of the former template RNA by another one, the leader sequence. Processes
of this type need to overcome an energy barrier. RNA chaperones are RNA binding proteins
that may help to overcome this barrier. We have shown that TGEV N protein is an RNA
chaperone that is also active in viral RNA annealing (Zúñiga et al., in this book).

(iii) PTB binds to the sequence complementary to the 3’-untranslated region
(c3’-UTR) of MHV, inducing a conformational change in RNA structure. Mutations of the
PTB-binding site in either the 5’-leader or the sequences complementary to the 3’-UTR
inhibited replication and transcription of MHV genomic and defective-interfering (DI)
RNA in direct proportion to the extent of reduction of PTB binding, suggesting that PTB
plays a role in regulating viral RNA synthesis. Thus, the interaction of N protein with
PTB may modulate transcription.

(iv) SYNCRIP (p70) is a member of the hnRNP family and localizes largely in the
cytoplasm. p70 was cross-linked to MHV positive- or negative-strand RNA. The p70-binding
site was mapped to the leader sequence of the 5’-UTR and requires the UCUAA repeat
sequence. Overexpression of p70 inhibited syncytium formation induced by MHV.
Furthermore, downregulation of endogenous p70 with a specific short interfering RNA
delayed MHV RNA synthesis. These results suggest that p70 may be directly involved in
MHV RNA replication as a positive regulator.

(v) EAV nsp1 has been proposed to couple genome replication and transcription.
nsp1 has been shown to interact with p100, and nsp1-p100 interactions have been
speculated to be important for viral sgRNA synthesis, either directly or by recruiting a
p100 binding protein to the viral RdRp complex. Alternatively, nsp1 might modulate
transcription in the infected cell, which would explain why the protein is targeted to
the nucleus.


4. CONCLUSIONS 


The precise role of the proteins described above, and of other viral and cellular
proteins, needs to be confirmed in the context of discontinuous transcription during the
synthesis of the negative strand, giving special attention to intermediates of replicase
processing and to proteins associated with membrane structures located in the cytoplasm.
Functional proteomics could be of great help in this complicated task. In addition, the
establishment of in vitro replication and transcription systems will help to clarify the
mechanisms involved in CoV replication.